III. Processing Techniques Development.

This task is divided into two subtasks: Technology Evaluation and
Development, and Scanner System Parameter Selection. A portion of the
Technology Evaluation and Development subtask, which concerns the
documentation and evaluation of the unsupervised ECHO (Extraction and
Classification of Homogeneous Objects) classifier, is published in a
separate volume so that analysts interested in using that algorithm may
obtain its documentation and evaluation without acquiring excess pages.
The remainder of the Processing Techniques Development task is reported
in this volume.

The Technology Evaluation and Development subtask comprises two
major portions: Technology Development and Technology Interchange System
Development. The Technology Development portion has been concerned with the
implementation and evaluation of the LIST (Label Identification from Statistical
Tabulation) approach to image labelling and the completion of the evaluation
of the ECHO (Extraction and Classification of Homogeneous Objects) classifiers,
which was begun in FY77. The Technology Interchange System Development
portion of the Technology Evaluation and Development subtask has supported the
JSC 2780 terminal, the conversion of JSC software to the LARS system, the
development of the ECHO Analysis Case Study and Data 100 instructional
materials, and the planning of a short course to increase technique
interchange between NASA/JSC and Purdue/LARS.

The Scanner System Parameter Selection study has addressed the problem 
of evaluating proposed scanner system designs by developing models for scan- 
ner systems and methods for evaluating the classification error of a scanner 
system for a given remote sensing objective. 

A-1. Technology Development. 

I. Work Accomplished: 

Work on Technology Development during the contract period concentrated
on:

• studying the characteristics of the labelling procedure called
LIST (Label Identification from Statistical Tabulation);

• bringing together a data set for the study and evaluation of this
labelling procedure;

• writing and debugging programs supporting the LIST investigation;
and

• formulating the integration of the LIST labelling procedure
and recent developments in remote sensing technology.


During this quarter, we have started to study the characteristics of the dot
labelling procedure called LIST (Label Identification from Statistical
Tabulation) developed by a joint SRT (UCB and ERIM)/LEC effort. This statistical
approach for estimating dot labels is based on the answers of an analyst to a
list of questions, with the help of associated ancillary data. The answers
are used in a linear discriminant analysis for finding the corresponding
label. The method has already been tried at JSC, with some encouraging
results. With the information at our disposal we are initiating a similar
sequence of procedures in order to evaluate the method and find some possible
modifications and alternatives. In addition to the general objectives of
the LIST Method [1], additional goals for this work are:

• Make the procedures as machine oriented as possible with the 
idea of obtaining a partially (or in the best case totally) 
computer implemented technique. 

• Possibly modify the actual set of questions by restating them
in a more quantitative form and/or by the addition of new
questions. This may be done with the idea of improving the
performance or of achieving the first objective above.

• Study alternative methods of analysis as well as the linear 
discriminant approach. 

With these objectives in mind the present set of questions was examined.

Although not all the material necessary for completing the questions
is available at Purdue/LARS at the present time, some general comments
can be made regarding the machine adaptability of the LIST questions:

• Segment Questions from Imagery: Most of the questions in this
set have to be answered by a human analyst, making the use of
an automatic procedure difficult.

• Cropping Practices: This set of questions requires some ancillary
data, such as nominal crop calendar and percentage of crops,
and again the intervention of the human analyst is decisive.

• Meteorological Data: In answering these questions, the analyst
must rely primarily on the met summary. In this case most of the
answers can be quantified, making them more suitable for machine
processing.

• Pixel Specific Questions: This set of questions seems to be
the most important in order to label a specific dot. The
analyst in this case has to use his knowledge and experience,
including a familiarity with different kinds of aids such as
spectral plots, trajectory plots, green numbers, Kraus product,
and crop statistical data. Some of the questions can be easily
quantified but there are others whose answers depend a great
deal on the analyst and cannot be objectively quantified.

The knowledge and experience of the analyst can be decisive in 
several cases so that the idea of a totally automatic procedure 
may not be possible. As a result of this preliminary evaluation 
of the LIST questions, an attempt is being made to restate some 
of the questions in a more quantitative way and to supplement 
some subjective questions with more objective measures in order 
to obtain at least a partially automatic procedure. 



Seven LACIE segments in Kansas in the 1976 crop year have been selected 
as the basis for the study of this procedure. These segments (1851, 1856, 
1857, 1860, 1865, 1866, and 1889) were chosen based on the availability at 
LARS of the corresponding full-frame imagery and true dot labels. The LIST 
method has been applied to segments 1857 and 1865. Several problems have 
appeared and are discussed below. 

Programs for computing green numbers and trajectory plots have been 
implemented. Additional forms of displaying the digital information are 
being investigated in order to aid the analyst in his labelling decisions. 
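
The kind of computation such programs perform can be sketched as follows. This is a minimal illustration and not the LARS implementation: it assumes that the green number is the greenness component of the Kauth-Thomas transformation of the four Landsat MSS bands, using the commonly quoted 1976 coefficients, and it omits any offset or haze correction that an operational program may apply. The function names and the sample dot values are hypothetical.

```python
# Assumption: green number = Kauth-Thomas greenness for Landsat MSS
# bands 4, 5, 6 and 7 (commonly quoted 1976 coefficients). Check these
# against the program documentation before any real use.
GREENNESS_COEFFS = (-0.290, -0.562, 0.600, 0.491)


def green_number(pixel):
    """Return the greenness of one pixel given its four MSS band values."""
    if len(pixel) != len(GREENNESS_COEFFS):
        raise ValueError("expected four band values")
    return sum(c * v for c, v in zip(GREENNESS_COEFFS, pixel))


def trajectory(acquisitions):
    """Return (day, green number) pairs ordered by acquisition day.

    `acquisitions` maps day of year -> four band values for one dot.
    """
    return [(day, green_number(pixel))
            for day, pixel in sorted(acquisitions.items())]


def text_plot(points, width=40):
    """Render a trajectory as rows of asterisks, one row per acquisition."""
    values = [g for _, g in points]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    rows = []
    for day, g in points:
        bar = "*" * (1 + int(round((g - lo) / span * (width - 1))))
        rows.append(f"day {day:3d} {g:7.2f} {bar}")
    return rows


if __name__ == "__main__":
    # Hypothetical band values for one dot on three acquisition dates.
    dot = {200: (30, 34, 40, 18), 100: (28, 30, 36, 16), 150: (22, 18, 60, 34)}
    for row in text_plot(trajectory(dot)):
        print(row)
```

A trajectory that rises and then falls over the season is the pattern an analyst would inspect when deciding a label; the text plot stands in for the plotted output.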

After some review of available algorithms for discriminant analysis, it
seems that the one available in the SPSS statistical package is, at least as
a starting point, the most suitable, especially as a stepwise procedure for
selecting the most important features or variables is possible. However, at
a later stage, the use of a special purpose algorithm for carrying out the
classification may be better.
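
To make the discriminant step concrete, the following sketch shows a plain two-class linear discriminant with a pooled covariance estimate. It is not the SPSS procedure and does no stepwise variable selection. The encoding of the answers (a yes/no answer coded 1 or 0, plus a peak green number), the two label names, and all training values are hypothetical and serve only to show the mechanics.

```python
def mean_vector(rows):
    """Column means of a list of equal-length feature tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the class mean."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_discriminant(class_a, class_b):
    """Return (weights, threshold) separating class_a from class_b."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    sa = scatter_matrix(class_a, mean_a)
    sb = scatter_matrix(class_b, mean_b)
    dof = len(class_a) + len(class_b) - 2
    pooled = [[(x + y) / dof for x, y in zip(ra, rb)] for ra, rb in zip(sa, sb)]
    w = solve(pooled, [a - b for a, b in zip(mean_a, mean_b)])
    midpoint = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]
    threshold = sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, threshold


def label(answers, w, threshold, names=("small grains", "other")):
    """Assign the first label when the discriminant score exceeds the threshold."""
    score = sum(wi * xi for wi, xi in zip(w, answers))
    return names[0] if score > threshold else names[1]


# Hypothetical coded answers: (yes/no answer, peak green number).
SMALL_GRAINS = [(1.0, 30.0), (1.0, 34.0), (0.0, 28.0), (1.0, 31.0)]
OTHER = [(0.0, 12.0), (0.0, 15.0), (1.0, 10.0), (0.0, 14.0)]

if __name__ == "__main__":
    w, t = fit_discriminant(SMALL_GRAINS, OTHER)
    print(label((1.0, 29.0), w, t))
```

The threshold sits at the midpoint of the two class means, which corresponds to equal prior probabilities and equal misclassification costs; a special purpose algorithm could relax either assumption.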

II. Problems Encountered 

The chief problem encountered to date in implementing the LIST method 
has been the difficulty in acquiring the ancillary information for the seven 
segments being studied in Kansas. The following information forms the basis 
for some questions in the LIST method as it is presently formulated, and is 
not currently available at LARS in the form it has at JSC: 

a. percentage of each crop in county 

b. nominal crop calendar 

c. expected normal yield for a segment 

d. DU and DO areas for the segment 

e. Kraus products 

f. crop calendar adjustment information 

g. green number/biostage chart

h. examples of small grains trajectory plots and spectral 
development patterns 

i. crop statistical data. 

That information available to the LACIE analyst-interpreters at JSC
(items a, b, c, f, and i) is being sent to LARS from JSC but has not yet
arrived. The DU and DO areas for each segment can be estimated from the
PEC's presently available to us. Duplicates of the Kraus products will be
requested after the other information has arrived at LARS and has been
integrated into the LIST study. It is our understanding that green number/
biostage charts and examples of small grains trajectory plots and spectral
development patterns are not currently part of the analyst aids supplied to
the AI. If additional sets exist or are under development, copies would be
of great assistance in this project.

III. References

1. "Plan for Defining Dot Labelling Procedures for Procedure 1, The LIST 
Method", March 7, 1977. 


A-2. Technology Interchange System Development. 

I. Work Accomplished 

Work on Technology Interchange System Development during the contract
period concentrated in the following areas:

• Software Conversion Support 

• Tape Copy Software Plan 

• Support of the JSC 2780 terminal 

• ECHO Analysis Case Study 

• Data 100 system instructional materials 

• LARS System extended user facilities short course plan

• Technique Interchange plan 


During this contract period the concept of a Purdue terminal located at
the Earth Observation Division of NASA/JSC has matured. This maturing process
has changed the installation date of the Data 100 terminal from that
envisioned in the implementation plan approved in June. As a result of these
actions the terminals and the Purdue personnel supporting them are in a much
better position to serve the needs of the Earth Observation Division. Another
result is that some of the subtasks included in the implementation plan have
received greater attention than originally envisioned and others have been
delayed. The subtasks receiving greater attention were the software conversion
support and the support of the JSC 2780 terminal. The subtasks which
have been delayed are the tape copy software plan, Data-100 installation, the
tape copy software implementation, and the Data-100 installation evaluation.

During the past six months the communication between personnel at LARS 
and the personnel at JSC with respect to the Purdue terminal has increased 
significantly. This communication has been centered around the software 
aspects of the terminal especially in support of the computer needs at JSC. 
Several visits have been made by Purdue personnel to JSC to investigate
methods of providing better service and to relate specific capabilities of the 
Purdue hardware and software to JSC personnel. Education and consulting de- 
tails have been worked out to make the transfer of software by JSC personnel 
to the Purdue computer as effective and efficient as resources permit. A 
visit by Lockheed personnel to Purdue is planned for early November for the 
same purpose. 

In addition to the exchange of information in person, there have been 
frequent communications between personnel both via written documentation and 
telephone. As a result, there has been a change in computer systems at Purdue 
which has enhanced the terminal's capabilities at JSC. In addition the plans 
for the installation of the Data-100 are well understood by both Purdue and 
JSC personnel.

Purdue support of the JSC 2780 terminal has also been increased over the
project period. The services provided for the first five months were
approximately $29,000 as compared to the $24,000 budgeted for June through
November. This increase in services has primarily supported the software
conversion subtask.

With the announcement of a specific date for the Data-100 installation, 
a tape copy software plan is receiving attention. It is expected that the 
plan will be completed by November 30 for approval by the JSC personnel. The 
hardware installation and tape copy software implementation are tasks which 
will be completed during the next contract year. 

A case study on the use of the ECHO classifier (Extraction and
Classification of Homogeneous Objects) for analyzing multispectral scanner
data has been completed. The materials prepared for the case study include a
case study document (LARS Publication 090177, "A Case Study Using ECHO for
Analysis of Multispectral Scanner Data"), a set of instructor notes, a set of
reference data consisting of maps and aerial photographs, and a sample
analysis. The case study document introduces the ECHO processing function and
illustrates typical steps in the analysis of remotely-sensed data using ECHO
through discussion, an illustrative example and exercises. The instructor notes,
reference data and sample analysis serve as aids to individuals wishing to
carry out the analysis steps themselves.

A new unit of the LARSYS Educational Package, "Data 100 Remote Terminal: 
A Hands-On Experience", has been prepared. This unit of the educational 
package consists of a set of student notes, an accompanying audio tape, card
decks and instructor notes. These materials were prepared in anticipation
of a decision by JSC to upgrade their remote terminal through installation 
of a Data 100 system. Availability of this new unit of the educational pack- 
age will allow for immediate training and access to the LARS computation 
facilities by means of the Data 100 terminal. 

An outline of a short course covering topics designed to introduce JSC 
terminal users to the extended user facilities of the LARS computer facility 
has been prepared (see Appendix A-1). An earlier version of this outline was
presented to JSC personnel at the quarterly program review held in September 
1977. Since that time the outline has been revised to reflect comments re- 
ceived during the review and work has been initiated in detailing those 
portions of the course identified as being of keen interest to JSC personnel. 
As a result of an early November meeting with Tom Minter of LEC, the division 
of this short course into several one or two day seminars with different 
emphases has been proposed as a more effective way of promoting technique 
interchange. Also proposed were two or more seminars to be given by JSC or 
LEC personnel at LARS to inform and aid LARS personnel in using the software 
presently being converted to the LARS system. 

A two-part plan dealing with the interchange of technical information,
techniques and procedures between NASA/JSC and Purdue/LARS has been prepared.
Part I, dealing with specific retraining needs that would result from an
upgrade of the JSC/LARS remote terminal, was discussed at the September
quarterly program review. Part II deals with the interchange of technical
ideas and techniques from a more general viewpoint. It seeks to identify
features and conditions common to any technical interchange and suggests an
approach for facilitating and managing technique interchange. Parts I and II
of this plan appear as appendices A-2 and A-3 of this report.

II. Problems Encountered. 

Delay in the Data 100 installation decision has made it necessary to
carry out work under this task in more of a contingency mode of operation 
rather than working towards specific technique interchange goals. While this 
has made it difficult to follow the implementation plan schedule it has re- 
sulted in the development of a broader view of the problem. The net effect, 
we believe, will improve the longer range objectives of the technology inter- 
change system development effort. 


Appendix A-1. Short Course on Purdue/LARS User Facilities Available via a 
Data 100 Remote Terminal 


Day 1 A.M. 


I. Introduction 

Course Outline, Materials 

II. What do I have to do before I can use the LARS system? 
Computer ID's - Passwords - whom to contact

III. Data 100 demonstration. Exercise 1 - Data 100 Hands-on 
operation. 


Day 1 P.M. 

IV. Remote Terminal Procedures 
Responsible personnel 
How to dial up (when necessary) 
Login procedure 

V. Overview of LARS computer system 
Machine type 
storage capabilities 
operating system 
available environments 
virtual machines 
CP command Q V 
system flowchart 

VI. VM370 CP Commands 

Assessing System 
Controlling files

Day 2 A.M.

I. Review CMS editing 

II. Overview EXECS 

Exercise: Edit in BATCH EXEC 

Day 2 P.M. 

III. CMS utility functions 

IV. Review Classifypoints processor

A. Subroutines - system manual 

B. Loading and execution 

V. Modify classifypoints to become minimum distance classifier

A. Coding 

B. Creating the module 

C. Placing the module in the appropriate place 

Exercise: go to terminal and get copy of appropriate
FORTRAN coding and edit in coding changes.

Day 3 A.M. 

Exercise: After code is modified - create the module 

I. Abstracts 

A) understanding what is happening 

B) knowledge of how to use in other programs 

II. Review new program that plots spectral trajectories 
Exercise: Edit in *TRAJECT coding set up to run 

Day 3 P.M. 

Simulation Exercise: adding CGROUP to SEPARABILITY 

A) Coding 

B) Establishing Proper Loading 

C) Executing 

Day 4 A.M. 

Group Exercise: How to connect BIPLOT and SEIGEN
to get transformed plots

I. Recoding required 

Change Appropriate part of SEIGEN to callable subroutine 
Change BIPLOT to call the new subroutine 



II. Loading Necessary - EXEC file 


Exercise: various people edit in various changes 

Day 4 P.M. 

III. With T.I. in classroom - as a group 
try loading - debug as necessary 

When complete 

Try EXECUTION - debug as necessary 


Day 5 A.M. 


I. Continue debugging if necessary 

II. Consultation time - what you want to do and the best 
way to do it 


Day 5 P.M. 

III. Course Summary 

A) Review systems, LARSYS standards manuals 

IV. Other Documentation 

A. 370 Materials: VM370 and CMS 370 

B. Scanlines 



Appendix A-2

Technique Interchange Plan-Part I*

John C. Lindenlaub 
Purdue/LARS 
September 1, 1977

Outlined in this document, the first of a two-part technique 
interchange plan, are plans to meet the specific retraining needs 
that would result from an upgrade of the JSC/LARS remote terminal 
from its present IBM 2780 configuration to a Data 100 system and to 
provide training and experience to JSC personnel in using the LARS 
computer system at a level considerably beyond that covered in the 
LARSYS Educational Package. Included in the plan are provisions for 
training personnel in the use of the new hardware configuration, 
presentation of an ECHO analysis case study, a lecture/workshop 
series on system capabilities, and suggested procedures and require- 
ments to make the field measurements data base accessible to a 
larger group of users. 

By selecting different portions of the plan the level of 
technique interchange can range from learning how to operate 
the new equipment, to a management overview of system capabilities, 
to in-depth study and experience in algorithm implementation.

A subsequent document, Technique Interchange Plan-Part II,
will address the more general problem of exchanging any technique 
between JSC and LARS or LARS and JSC using the remote terminal 
system. 


*Prepared under NASA contract NAS9-14970

New Hardware Configuration

To handle the retraining requirements which will result from 
the installation of new remote terminal hardware equipment at JSC, 
it is proposed to replace Units III and IV of the LARSYS Educational 
Package, with materials entitled Unit III - Demonstration of the 
Data 100 Remote Terminal and Unit IV - The Data 100 Remote Terminal, 
a Hands-On Experience. These new materials will be modeled after 
the existing units III and IV of the LARSYS Educational Package. 

There are several reasons for this approach. First, the manpower
requirements to modify these units of the educational package are
not large; in fact, draft versions of these materials already
exist. Second, installation of new equipment will impact all present
as well as future users, and training materials patterned after
the LARSYS Educational Package materials will provide a convenient
mechanism for training present as well as future users of the
system. It is also expected that training of people to utilize the
remote terminal will be spread out over a considerable time duration
and it is advantageous to have training materials that can be used
by individual students and require a minimum of effort on the
part of training personnel. Furthermore, upgrading of the LARSYS
Educational Package materials to match the remote terminal hardware
will preserve the entire LARSYS Educational Package and make it
available to new personnel.

ECHO Classification Procedures 

A case study has been prepared to illustrate ECHO analysis 
techniques. It is proposed to conduct a series of workshops using 
the case study materials to train a group of JSC personnel in ECHO 
analysis techniques. The ECHO case study will be used as part of
a more comprehensive series of lectures and workshops dealing with
use of the remote terminal system. This is discussed in the next 
section. 

A case study detailing ECHO analysis techniques was prepared 
because of the newness of the technology and the relative complexity
of the algorithm. Use of the case study materials in a series of 
lecture workshops instead of on an individual basis is suggested for 
reasons of economy of scale. It is also expected that further 
experimentation and development of ECHO analysis procedures will be 
undertaken in the future and the case study will provide a solid 
introduction to this analysis technique. 

Efficient system utilization 

A short course for users of the LARS computer system at JSC
has been planned. The course is one week in duration. Composed of
lecture and workshop sessions, this course is designed to introduce
the participants to system capabilities such as CMS, experimental
and developmental LARSYS programs, and procedures for placing new
programs on the system.

The short course format is particularly well suited because
system capabilities are documented in a variety of reference sources.
An experienced analyst, serving as the short course instructor,
can guide participants through these sources and adapt the training
to meet the requirements of different participants in the course
who wish to achieve different levels of capability or who wish to
have different portions of the course emphasized.

An outline of this proposed short course is shown in Appendix A.
This outline is intended to serve as a point of departure for
designing the course and it is anticipated and hoped that a JSC
terminal user or remote terminal site expert would be available to
contribute to the finalization of the course design and work jointly
with the LARS staff member in presenting the course.

Access to field measurement data 

Suggested procedures for obtaining better access to field
measurements data by JSC personnel are significantly different from
procedures suggested in the sections above. If a requirement
exists on the part of JSC to have better access to field measurements
data it is suggested that the individuals needing access to this data
plan to spend a one to two week period at LARS working with LARS
personnel who are familiar with the software system and the data
analysis and collection techniques. Following a reasonable interval
of, say, four to six weeks, a member of the LARS staff would plan to
spend three to five days at JSC working with personnel on the remote
terminal system accessing and analyzing field measurements data.

This intensive one-on-one instruction is suggested in this 
case because of the limited amount of documentation available on the
EXOSYS system and the relatively dynamic nature of the software 
system. It would require considerably more effort to prepare 
suitable documentation if a larger number of JSC personnel were to 
be trained in this area. 

Discussion 

The short course outlined in Appendix A provides the framework
for satisfying a number of technique interchange needs.
Participation in the morning session of Day One is all that is
required for individuals desiring merely to learn "what new buttons
have to be pushed" to operate the Data 100 Remote Terminal.
Individuals interested in a "management overview" of the remote terminal
system can obtain this information by participating in the afternoon
session of the first day of the course. Individuals wishing to
learn how to use the system efficiently along with programs which
are presently on the system would participate in the entire short
course. Intermingling of the lecture presentations and computer
exercises helps provide necessary experience and reinforcement of
the ideas presented in the lectures. The series of exercises
suggested in the outline of Appendix A is geared towards presentation
of ECHO analysis techniques. Other exercises could be substituted
for individuals wishing to emphasize other aspects of the system.

It is suggested that a LARS data analyst and a JSC site expert 
jointly contribute to the planning and teaching of the course. This 
would increase the likelihood of having sets of examples and
lecture presentations which are particularly valuable to course 
participants. 

NASA Furnished Items 

In order to conduct the proposed course NASA must provide 
space (conference room or class room) and be able to dedicate the 
JSC remote terminal for at least four hours a day for training 
purposes. Table or desk space for the LARS analyst instructor in
or near the terminal area would also be desirable.

Two man-weeks effort of a JSC computer system specialist or 
remote terminal site expert would significantly improve the training 
course. One week would be spent in joint planning with LARS to
guarantee that the course content and computer exercises are well 
matched to JSC needs. The second man-week would be used in 
assisting with the course presentation and computer exercises.

Scheduling

LARS requires a minimum of 30 days advance notice of the dates
the training course is desired and the number of persons expected
to participate.

Appendix A

Beyond LARSYS: CMS, LARSYSXP, LARSYDV
Proposed Short Course for Users of the LARS 
Computer System at JSC 


Day 1 A.M.


I. Introduction 

Course Outline, Materials 

II. What do I have to do before I can use the LARS system? 

Computer ID's - Passwords - who to contact 

III. Data 100 demonstration. Exercise 1 - Data 100 Hands-on 
operation. 


Day 1 P.M. 


IV. Remote Terminal Procedures 
Responsible personnel 

How to dial up (when necessary) 

Login procedure 

V. Overview of LARS computer system 
Machine type 

storage capabilities 
operating system 
available environments 
virtual machines 
CP command Q V 
system flowchart 

VI. Review general analysis steps 

VII. Overview of Short Course 

examples of how the system can be used more efficiently and 
discussion of advantages 
use of operating system 

assessing system (query) 

control of files (remote, purge, close, set, xfer) 
use of CMS for control cards, submitting batch jobs 
new or revised LARSYS functions 
use of CMS for altering virtual machine 
use of CMS for altering or establishing programs

Day 2 A.M. 

VIII. CMS - editing

IX. Control Cards 

where to get listings 

review PICTUREPRINT control cards 

class write control cards needed 

Exercise 2. Using CMS create a disk file containing all
the cards needed to produce a grayscale image
from PICTUREPRINT

Day 2 P.M.


X. Other CMS functions

XI. Batch Machines

Exercise 3. Add necessary batch control cards to PICTUREPRINT 
file and submit the batch job from the terminal 


Day 3 A.M. 


Exercise 4. Assemble grayscale and pick candidate training 
areas. 

XII. Review cluster control cards (DV or XP - IDNAME) 

Exercise 5. Using CMS create cluster control card file and
submit to LARSYS on line

Exercise 6. Identify cluster classes

Day 3 P.M.

XIII. More analysis 

merging statistic files (options) 
checking class separability 

Exercise 7. Submit MERGE and SEPARABILITY jobs via card reader 
to batch machine 

Exercise 8. Use separability output to decide how classes 
should be combined 


Day 4 A.M. 

XIV. How to check validity of decisions 
rerun separability
use of BIPLOT and SCATTERPLOT 

Exercise 9. Rerun MERGE and SEPARABILITY; if OK, run BIPLOT
and SCATTERPLOT to evaluate further


Day 4 P.M. 

XV. ECHO vs CLASSIFYPOINTS
ECHO control cards 

Exercise 10. Run ECHO classification and PRINTRESULTS

Day 5 A.M.

XVI. Review results

XVII. Other output products 
Varian 

Meade

XVIII. Other experimental or developmental LARSYS processors

XIX. Documentation
LARSYS users manual 

LARS computer users guide 
Scanlines 

Day 5 P.M. or worked in as time permits

XX. CMS - programming EXECS 

XXI. LARSYS system manual 

XXII. Abstracts 

XXIII. You can modify LARSYS programs or write new ones of 
your own. 


Appendix A-3

Technique Interchange Plan-Part II*
John C. Lindenlaub 
Purdue/LARS 
November 15, 1977 


This is the second part of a two-part document dealing with
the interchange of technical information, techniques and procedures
between NASA/JSC and Purdue/LARS. The first part of the document,
dated September 1, 1977, dealt with specific retraining needs that
would result from an upgrade of the JSC/LARS remote terminal.

This part of the plan deals with the interchange of technical 
ideas and techniques from a more general viewpoint. It seeks to 
identify features and conditions common to any technical interchange 
and suggests an approach for facilitating and managing technique 
interchange. 

The plan described here is an outgrowth of the experience
gained from the Remote Terminal Experiment [1] and of several years of
experience with 2780 terminal capability at JSC. It is predicated
on the assumption that JSC will upgrade their terminal hardware and
that the JSC/LARS terminal will continue to be the prime motivation
and facility for carrying on technique interchange activities.

Examples of the kinds of technical interchanges under consideration
are: 1) transfer of Procedure I analysis techniques from
JSC to LARS, 2) transfer of capability to access and analyze
field measurements data from LARS to JSC, 3) transfer of techniques
for modifying or adding algorithms to the set of experimental
LARSYS programs from LARS to JSC, and 4) transfer of specific data
analysis programs (such as a clustering algorithm) from JSC to
LARS.


Fundamental Parameters 

There are a number of parameters which govern the technique 
interchange process which must be taken into account when planning 
procedures for the interchange of techniques between two technical 
organizations. These parameters are: 

• number of persons to receive technique 

• present state of technology 

• time constraints 


*Prepared under NASA contract NAS9-14970

Understanding these parameters and their interrelationships for
any particular technique interchange will facilitate planning,
estimating the success of, and carrying out a technique interchange
program.

The relationship between the degree of technique interchange 
achievable and the state of the technology for a given time constraint 
is basically linear as shown in Figure 1. In this Figure the number 
of trainees has been used as the primary measure of technique 
interchange. While it is recognized that all trainees may not use 
a new technique on a regular basis, the number of trainees is a good 
measure of the throughput capability of a technique interchange 
program. 

For purposes of technique interchange the state of the technology 
may be measured in terms of the kind and amount of documentation 
that is available. As a technology develops, the kind of documentation
available tends to span the range from the informal notes
of the originator through technical reports, conference papers, 
dissertations and journal papers to tutorial materials such as lec- 
ture notes or text material. As each of these various types 
of documentation comes into being there is a corresponding increase 
in detail and clarity which in turn opens the technology to a wider 
audience. 

A number of other factors can be superimposed on the basic
linear relationship between number of trainees and degree of
documentation. These are shown in Figure 2. Looking at line 1, the
type of personnel that can be used to assist in the technique 
interchange process is listed in relation to the state of the 
technology and degree of technique interchange. When the state of 
the technology is low (i.e., recent developments, little documen- 
tation) the only type of individual who can successfully tell 
someone else about the technology is the technical expert who 
developed the technology. At most this person can instruct only 
a few other individuals. As a technique develops, understudies of
the originator (colleagues, graduate students) become familiar 
enough with the technique to be able to explain it to others, thus 
providing the potential for training a larger number of people. 

When technical counterparts at other organizations learn enough
about a technique that they can begin to pass the information on
to their colleagues, the number of potential trainees is expanded even
further. A technique important enough to warrant having a person
spend a significant portion of his or her time explaining or
instructing in the technique results in the development of persons
to act as tutors. Finally, as the technology becomes fully developed
with accompanying tutorial documentation, individual learners can
be counted on to learn about the new technology on their own.

Line 2 on Figure 2 illustrates the type of documentation and
visual materials that are generally available as a new technique
develops. When the only available documentation is the original
notes of the technical expert who originated the new technique, only
a limited amount of technical interchange can take place. Technical
reports and their illustrations allow a larger number of people
to have access to the new idea or method. Tutorial reports that can
be used by technical counterparts at other organizations permit 
access to a wider audience. Such documentation is usually complete 
enough to transfer the technique to personnel at other organizations. 
Development of slide-tape or videotape presentations makes it 
possible to schedule regular classes or permit individual learners 
to have access to the technology on a demand basis. A text book 
in a technical area essentially makes the technology available to 
the world. 

Line 3 of Figure 2 shows different types of instructional 
methods in relation to the type of personnel used to transfer the 
technology and the type of documentation/illustrations available. 

The originator of a technique can work from his notes and by 
means of a discussion with one or two other technically qualified 
people transfer the technique to them. Larger groups can be handled 
in a seminar setting. This format works best if there is at least 
a technical report available around which the discussion can be 
centered. The lecture format can be used for still larger groups 
but because there is less opportunity for discussions, written 
material in the form of tutorial reports should be available. 

When tutors are available, mediated lectures can be used. This
facilitates making the lectures available on a repeating basis and
increases schedule flexibility. With sufficient tutorial
documentation, individual learners can acquire knowledge about a new
technique through self-study.

Figure 1 illustrates the relationship between the state of 
development of a new technique and the number of persons to 
receive the technique. Figure 2 shows personnel, documentation 
and teaching method parameters superimposed on this basically 
linear relationship allowing one to visualize these interrelation- 
ships. It is important to restate that these Figures show the 
relationship between the state of the technology and degree of 
technique interchange for a fixed time constraint. Many people 
could learn about a new technique directly from the originator 
through a series of seminars which is repeated over and over again. 
However, unless this person's role in the organization is going 
to change, this usually would not be an appropriate course of 
action. Rather, if a large number of people is required to have 
access to the technology, an investment should be made in personnel 
and materials to accomplish the job. 

Time is another fundamental parameter influencing technique 
interchange. Time constraints can be of several forms: 

How soon do people need to be trained? 

Will they be trained in a group or individually? 

Will training availability be required over an extended time?

All of these questions must be addressed when planning a technique
interchange.


Planning a Technique Interchange

The steps in planning a technique interchange may be summarized 
as follows: 

• Determine the number of people desiring access to 
the technique 

• Determine temporal constraints 

• Assess present state of documentation 

• Decide on most appropriate instructional format 

• Prepare additional documentation as needed 

• Offer training 

• Evaluate success of technique interchange 

When planning a technique interchange the fundamental parameters
discussed in the previous section should be kept in mind.
Determining the number of people requiring access to the technology
provides information for deciding what kind of documentation is
most desirable. If only a few (3 to 4) people are involved, it does
not make sense to write a textbook. If 60 people need to be trained,
some sort of tutorial documentation is necessary.

Temporal constraints have a large impact on planning a 
technique interchange. If immediate training is needed there will 
not be time to produce any significant amount of documentation 
beyond that which is already available. This may necessitate 
following a procedure which is known to be suboptimum. As an 
example, a situation may require 40 persons to be trained in the 
use of a new analysis algorithm on very short notice. If the only 
documentation available consists of a technical report describing
the theoretical basis of the algorithm and a brief description 
of the kind and format of input data and variables required, one 
would not expect to get well-trained, competent users of the new 
algorithm by exposing them to a few hours of lecture. However, 
if this is the only choice available, it will be the one used. 
Therefore, one should be aware that the training program is 
suboptimum and judge the results accordingly. 

If training is required to be available over an extended
time, it is desirable to prepare special instructional materials.
This can best be accomplished by using the experience of one
presentation to improve upon the materials for the next presentation.
Thus, over a period of time a good set of tutorial materials will
evolve.

Having determined the number of people requiring access to
the technique and the temporal constraints, one should make an
assessment of the present state of documentation. This assessment will
reveal whether or not a successful technique interchange can take
place. By examining the present state of documentation in
relation to the number of people requiring access to the technique,
a decision can be made as to whether or not additional documentation
is required and whether it is possible within the time
available.

Based upon the information obtained so far, a decision can 
be made on the most appropriate instructional format. If little 
documentation is available and time constraints require that instruc- 
tion take place immediately, a one-on-one tutorial format or small 
group seminar would be most appropriate. If the seminar format 
cannot accommodate the number of people required to receive
instruction on the new technique, it may be necessary to repeat the
seminar a number of times. While lack of documentation may 
suggest the use of a seminar format, a large number of students 
would require that this seminar be repeated many times. This in 
itself is time consuming. In that situation it may be more 
advantageous to prepare additional documentation for use in a 
lecture format. 

Having assessed the present state of documentation, decided on 
the most appropriate instructional format, and prepared additional 
documentation as required, the training program may be offered.

The final step in planning a technique interchange is to 
devise a method for evaluating the success of the technique 
interchange. One of the best ways to achieve this is to set down 
operational or behavioral objectives for the training program and 
a series of tests or exercises to determine whether or not trainees 
have met these objectives. Questionnaires may also be used to 
evaluate the technique interchange program. 


Guidelines for Carrying Out a Technique Interchange 

To carry out a technique interchange it is important to 
identify key individuals in each organization to oversee the planning, 
preparation, technical interchange, and evaluation steps of the 
process. Regular communication between these two individuals will 
help to ensure that the expectations of personnel within both
organizations are realistic. 

One of the planning steps described above was to decide on 
the most appropriate instructional format. This decision cannot 
be made independent of time and documentation constraints so that 
one cannot specify absolute guidelines for choosing the proper 
instructional format. In the absence of any time constraint and
assuming necessary documentation is available, Figure 3 provides
useful information for choosing an appropriate instructional format.
One-on-one tutoring works quite successfully if the number of
participants is rather small, one to six people. A seminar consisting
of discussion and/or workshops works well for medium size groups,
say four to fourteen people. Lectures coupled with individual
exercises can be used quite successfully with groups as large as 35.
Special techniques, usually involving skillfully prepared self-study
materials, are required when a very large number of individuals
is to participate in the program.
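
The group-size guidance quoted above can be expressed as a small lookup. The sketch below is only an illustration of those ranges: the function name is hypothetical, the lecture format is assumed to have no lower size limit (the text gives only an upper figure of 35), and the overlapping ranges are kept as stated, so more than one format may be returned.

```python
from typing import List


def suitable_formats(group_size: int) -> List[str]:
    """Return the instructional formats whose quoted size range covers group_size."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    formats = []
    if group_size <= 6:                # one-on-one tutoring: one to six people
        formats.append("one-on-one tutoring")
    if 4 <= group_size <= 14:          # seminar: roughly four to fourteen people
        formats.append("seminar")
    if group_size <= 35:               # lectures with exercises: groups as large as 35
        formats.append("lecture with individual exercises")
    else:                              # very large groups need special techniques
        formats.append("self-study materials")
    return formats


for size in (3, 5, 20, 60):
    print(size, suitable_formats(size))
```

Time and documentation constraints, which the text says cannot be separated from this choice, are deliberately left out of the sketch.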

A number of schedule factors must be accounted for when 
carrying out a technique interchange. These factors include: 

• The date on which training activities are to begin 

• Amount of time needed for training 

• Time span over which training will take place 

• Replication of training programs as required. 

In establishing a date on which training is to begin, allowance 
must be made for materials preparation. Guidelines for the 
preparation of various types of materials are discussed below. 
Consideration should also be given to human factors when planning 
a training schedule. For instance, if it is estimated that a 
particular program will require 16 hours of lecture instruction, 
the program is likely to be more successful if this is spread 
out over a 4-day period rather than requiring participants to sit 
in 8 hours of lecture two days in a row. This is especially 
important in situations when participants in the program have other 
duties. By using a mixture of instructional formats, such as 
alternating lectures with workshop periods, it is possible to carry 
out more effective "full-time" training programs than if a single 
instructional format were used. If scheduling permits, the time 
span over which the training program will take place should be two 
to three times longer than the amount of formal instructional time. 
This will allow time for participants to attend to other duties or 
to review difficult concepts. 
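
The scheduling rule of thumb in this paragraph reduces to simple arithmetic. The following sketch uses hypothetical function names and assumes that both the instructional time and the overall span are counted in working hours, which the text does not state explicitly.

```python
def training_span_hours(formal_hours):
    """Suggested overall span: two to three times the formal instructional time."""
    return (2 * formal_hours, 3 * formal_hours)


def lecture_hours_per_day(formal_hours, days):
    """Average daily lecture load when the instruction is spread over `days`."""
    return formal_hours / days


formal = 16  # hours of lecture instruction, the example used in the text
print("span (hours):", training_span_hours(formal))
print("load over 2 days:", lecture_hours_per_day(formal, 2), "hours/day")
print("load over 4 days:", lecture_hours_per_day(formal, 4), "hours/day")
```

For the 16-hour example, spreading the lectures over four days halves the daily load relative to the two-day schedule, which is the point the paragraph makes about human factors.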

The amount of time required for the preparation of different 
types of instructional materials is shown in Figure 4. These time
estimates are based on the assumption that the person preparing the
materials is devoting 1/4 to 1/3 time effort on the materials
preparation project. The lead times indicated also allow for the
review of materials by technical colleagues and, in the case of
mediated materials, for art work, photographic services, etc.

Whenever possible, the preparation of instructional materials 
should be carried out in 4 steps: 

• Draft materials 

• Technical review 

• Student tryout 

• Rewrite 

The materials may be drafted by the originator of the technology 
or an associate who is familiar with the area. Technical review 
should be carried out by the originator of the technique or a person 
who is thoroughly familiar with the technical aspects of the
material to be presented. The material should then be reviewed by 
a typical "student", often a junior staff member working in a 
related area. Based upon the technical review and student tryout, 
materials are rewritten prior to use with the training group. 

Once the instructional format has been decided, scheduling 
taken care of, and necessary materials prepared, the technique 
interchange process can begin. Personnel serving as instructors 
should be sensitive to participants' reaction and their ability to 
absorb the material as presented. "On line" modifications should 
be made as required. A post-training follow-up and evaluation pro- 
vides valuable input for replication of the training program or the 
planning of other technique interchanges. 


Summary and Recommendations 

This part of the technique interchange plan has identified and
discussed fundamental parameters which should be taken into account
when planning procedures for the interchange of techniques between
two technical organizations. Steps for planning a technique
interchange were summarized and discussed and guidelines for carrying
out a particular program were presented.

It is recommended that this document be used as a planning 
aid for two trial technique interchange programs. One program 
should involve transfer of techniques and capabilities from 
Purdue/LARS to NASA/JSC. The other program should involve
transfer of techniques from NASA/JSC to Purdue/LARS. Candidate
subject matter for the former technique interchange includes
the computation capabilities available to JSC via the remote terminal
and the LARS computer facility. Candidate subject matter for the
latter technique interchange is the PI analysis procedure and its
support computer programs.


Reference 

Phillips, T.L., H.L. Grams, J.C. Lindenlaub, S.K. Schwingendorf,
P.H. Swain and W.R. Simmons. Remote Terminal System Evaluation. 
LARS Information Note 062775. 

B. Scanner System Parameter Selection

I. Introduction 

The Scanner System Parameter Selection project consisted of three tasks 
plus planning and reporting during the contract period. These tasks are part 
of a program developing analytical and simulation models for remote sensing 
systems. The models are intended to permit evaluations of parameter sets and 
to enable optimization of scanner system design for a given remote sensing 
task. Progress on those tasks is detailed in the following sections. The 
task numbers refer to those defined in the implementation plan submitted for 
this contract. 

II. Task 2. Test and Evaluate Classification Error Prediction Algorithm. 

In the previous contract a classification error estimating algorithm
was developed and applied to multispectral data, specifically the data
developed for the thematic mapper simulation study [1]. Appropriate
comparisons were reported with favorable results. In this contract a more
complete and convincing evaluation of the error estimating algorithm was
conducted. Multispectral Landsat data was classified and the resulting
classification accuracy was compared with the output of the error predictor.
Three test areas were selected: (1) Ogle County, Illinois, (2) Graham
County, Kansas, and (3) Grant County, Kansas.

a. Ogle County, Illinois. 

This data is a portion of Landsat scene 1017-16093 acquired August 9,
1972, and has a LARS runtable entry of 72032806. Three training classes were
used and classification was performed using four spectral bands, i.e.,
channels 1 thru 4. Table B-1 shows both the classification accuracies
obtained using the LARS point classifier and the error prediction algorithm
estimates.


Table B-1. Classification Performance Comparison for Ogle County, Illinois,
August 9, 1972.

Class        No. Points    Pt. Clsf.    Error Prediction
                                        Algorithm

Corn             411          87.3           91.7
Soybean          224          90.6           91.3
Other            217          94.0           90.6
Overall          852          90.7           91.2

b. Graham County, Kansas.

This data set is LACIE SRS segment 1018 and has a LARS runtable entry of
74028500. Channels 9 thru 12, or the acquisition corresponding to Landsat
scene 1672-1644, were used. Four training classes were developed from 229
training fields. Results are tabulated in Table B-2.


Table B-2. Classification Comparison for Graham County, Kansas, May 26,
1974.

Class           No. Points    Pt. Clsf.    Error Prediction
                                           Algorithm

Bare Soil           443          65.9           78.3
Corn/Sorghum         99          89.9           91.0
Pasture            1376          98.4           95.1
Wheat               459          94.8           93.9
Overall            2377          87.2           89.6


c. Grant County, Kansas. 

This data set is LACIE SRS segment 1036 and has a runtable entry of
74027600. Channels 5 thru 8, or the acquisition corresponding to Landsat
scene 1655-16512, were used in the classification study. Five training
classes were developed from 388 training fields. Results are tabulated in
Table B-3.


Table B-3. Classification Comparison for Grant County, Kansas, May 9,
1974.

Class        No. Points    Pt. Clsf.    Error Prediction
                                        Algorithm

AG 1             793          52.3           59.3
AG 2             446          75.8           73.3
AG 3             134          90.3           88.8
Nonfarm          762          94.9           90.5
Wheat            930          82.7           79.7
Overall         3065          79.2           78.3
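
The report does not state how the Overall rows were computed. As a check on the tabulated numbers only, the sketch below shows that the Overall entries of Table B-3 agree, to within rounding, with the unweighted mean of the per-class values; this is an observation about the figures, not a description of the LARS software, and the function name is hypothetical.

```python
def mean_class_accuracy(values):
    """Unweighted mean of per-class percent-correct values."""
    values = list(values)
    return sum(values) / len(values)


# Per-class values copied from Table B-3 (Grant County, Kansas).
point_classifier = {"AG 1": 52.3, "AG 2": 75.8, "AG 3": 90.3,
                    "Nonfarm": 94.9, "Wheat": 82.7}
error_predictor = {"AG 1": 59.3, "AG 2": 73.3, "AG 3": 88.8,
                   "Nonfarm": 90.5, "Wheat": 79.7}

print("point classifier:", round(mean_class_accuracy(point_classifier.values()), 1))
print("error predictor: ", round(mean_class_accuracy(error_predictor.values()), 1))
```

A mean weighted by the number of points per class would give a different figure, so the distinction matters when the Overall rows are compared between test areas.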

d. Simulation of Graham County Statistics.

A further test was conducted on the following basis: Normality of the
statistics of the multispectral data is a generally accepted feature.
Whenever a new method is developed, however, its performance cannot be
evaluated satisfactorily because any deviations from the desired results
could be attributed to the non-normality of the particular data set. If this
element of uncertainty could be eliminated from the analysis, then any
inadequacies can be traced back to the algorithm rather than to the
non-normality of the data.

This goal is achieved by generating synthetic normal data which has the
same statistics as the multispectral data but is statistically Gaussian. The
algorithm that accomplishes this task uses the statistics of an already
classified area and generates random numbers having appropriate class
statistics. There is a one-to-one correspondence between the field
coordinates in the simulated data and the original data. This simulated data
is classified by the LARS point classifier and its classification accuracy
is compared with the error predictor algorithm estimates. Results are shown
in Table B-4.


Table B-4. Comparison of Classification Performance Using Simulated Data.

Classes      Pt. Clsf.    Error Prediction
                          Algorithm

Class 1         50.1           69.0
Class 2         86.9           86.0
Class 3         92.9           90.6
Class 4         89.1           84.9
Overall         79.7           82.6


These results are not as conclusive as expected. The LARSYS and error
estimation model results were judged not to be close enough throughout the
various classes. In examining the histograms of the artificial data, it was
noted that the statistics are not as close to a normal distribution as we
expected. The following discussion explains this result. The simulation
algorithm generates the data while conserving a geometrical correspondence
with the real data. It is true that the full set of points in the entire
class is normally distributed; however, only the training fields are used in
classification and error probability calculation. Training fields, being a
subset of the entire class, did not exhibit normality to the degree desired.
Moreover, their statistics, if recomputed, showed deviations from the
desired statistics. Therefore, a different simulation process was examined.
This algorithm does not preserve any spatial correspondence between the real
and artificial data. It generates a specified number of pixels according to
the given distribution. The two histograms are shown in Figure B-1. The
newly generated artificial data was classified by the maximum likelihood
point classifier.
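
The second simulation procedure can be sketched as follows. This is a minimal illustration and not the LARS software: the class means and covariances are hypothetical two-channel values, equal prior probabilities are assumed, and NumPy's random generator stands in for whatever generator was used in the study. Changing the seed plays the role of specifying a different starting point for the random number generator.

```python
import numpy as np


def gaussian_ml_classify(pixels, means, covs):
    """Assign each pixel to the class with the largest Gaussian log-likelihood.

    Equal prior probabilities are assumed (an assumption of this sketch).
    """
    scores = []
    for mean, cov in zip(means, covs):
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to this class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores.append(-0.5 * (maha + logdet))
    return np.argmax(np.stack(scores, axis=1), axis=1)


def per_class_accuracy(means, covs, n_pixels, seed):
    """Generate n_pixels Gaussian pixels per class and return percent correct per class."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for label, (mean, cov) in enumerate(zip(means, covs)):
        pixels = rng.multivariate_normal(mean, cov, size=n_pixels)
        predicted = gaussian_ml_classify(pixels, means, covs)
        accuracies.append(100.0 * float(np.mean(predicted == label)))
    return accuracies


# Hypothetical class statistics for two classes in two channels.
means = [np.array([30.0, 40.0]), np.array([36.0, 44.0])]
covs = [np.array([[9.0, 2.0], [2.0, 6.0]]),
        np.array([[8.0, -1.0], [-1.0, 10.0]])]

for seed in (1, 2, 3):
    print(seed, [round(a, 1) for a in per_class_accuracy(means, covs, 2000, seed)])
```

Because the pixels are drawn directly from the class distributions, any remaining gap between these accuracies and an analytical error estimate reflects sampling variation rather than non-normality.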

Error predictor estimates, being a function of the class statistics alone,
were unchanged from the previous case. The comparison is given in Table B-5.


Table B-5. Comparison of Classification Performance of Point Classification
of Simulated Data and the Predicted Error.

Class         Point              Error          Accuracy
              Classification     Prediction     Difference
              Simulated Data     Algorithm

Bare Soil         77.8              78.3           0.5
Corn              91.2              91.0           0.2
Pasture           95.3              95.1           0.2
Wheat             94.2              93.9           0.3
Overall           89.6              89.6



The classification accuracies obtained by these two independent methods
are extremely close. Slight variation in error rate will occur in the point
classification of simulated data results if a different set of random data is
used in classification. Table B-6 shows the probability of correct
classification for three cases when a different starting point is specified
for the random number generator.


Table B-6. Comparison of Point Classification of Simulated Data with Different
Initial Conditions on Random Number Generator

Class         Random       Random       Random
              start #1     start #2     start #3

Bare Soil       77.0         79.7         79.0
Corn            91.2         92.1         91.0
Pasture         94.8         96.1         95.0
Wheat           94.0         94.2         94.8
Overall         89.2         90.5         90.0

Figure B-1. Comparison of Simulated Data Histograms Before and After
Software Modification.

The overall classification accuracies obtained by LARSYS vary slightly
above and below the error prediction algorithm result. This example
illustrates two points. One is that the error model estimates are close to
the classification results from simulated normal data. The second and more
interesting point is that by using this estimator we were able to predict
the non-normality of the original data or, more likely, the selection of a
set of training fields not completely representative of the entire class.
Although it is widely assumed that multispectral data has a normal
distribution, in practice this assumption is not completely satisfied. Using
the results just reported we can investigate the effect of non-normality of
the data on the classification accuracy.

Comparison of the entries in Table B-2 and Table B-5 shows that the
classification of the original Landsat data is about two percent less accurate
than the classification of the simulated normally distributed data. This could
be the result of either the violation of the normal assumption in the original
data or the lack of representativeness in the training fields.
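
The size of this gap can be read directly from the two tables. The short sketch below simply subtracts the point-classifier column of Table B-2 from the simulated-data column of Table B-5; the Corn/Sorghum class of Table B-2 is paired with the Corn row of Table B-5 on the assumption that they refer to the same class.

```python
# Percent correct, copied from Table B-2 (Landsat data) and Table B-5 (simulated data).
landsat = {"Bare Soil": 65.9, "Corn": 89.9, "Pasture": 98.4,
           "Wheat": 94.8, "Overall": 87.2}
simulated = {"Bare Soil": 77.8, "Corn": 91.2, "Pasture": 95.3,
             "Wheat": 94.2, "Overall": 89.6}

# Positive values mean the simulated Gaussian data was classified more accurately.
gaps = {name: round(simulated[name] - landsat[name], 1) for name in landsat}

for name, gap in gaps.items():
    print(f"{name:10s} {gap:+.1f}")
```

The overall difference is 2.4 points, and the per-class differences show that most of it comes from the bare soil class, while pasture and wheat are classified slightly better in the original data.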

e. Conclusion.

It was intended through these test runs to further validate the
classification error prediction algorithm for obtaining correct
classification accuracies. In the previous reports, correct classification
accuracies were in the high 90% range and comparable results were obtained
for both methods. The regions analyzed in this report exhibit classification
accuracies in the high 70, 80 and low 90% range. In all of the test runs,
the overall classification accuracies obtained through the error prediction
method were well within the analyst’s tolerance of those obtained with the
LARS point classifier, making this method a viable alternative to the
present classification scheme and a necessary tool in theoretical analysis
of a multispectral scanner system.

III. Task 4. Karhunen-Loeve and Information Theory Scanner Model Development. 

The goal of this research task is to develop an analytical procedure 
that will establish a theoretically optimal remote sensing system design. 

For a given scene, S, the class of all possible spectral response functions
in the scene is represented by a finite set of the possible waveforms. The
goal is to arrive at an optimum representation of the scene by selecting sample
response functions from the scene to represent the information classes within
the scene. In addition, each waveform is represented in a form convenient 
for analysis. If the scene has been represented accurately, the information 
necessary to design and evaluate a classifier is available. The particular 
emphasis in this task is to use this procedure to design and evaluate possible 
sets of wavebands for sensors. This approach will allow the selection of the 
optimal set of features for all possible remote sensing problems and provide 
a standard for comparison of suboptimal systems. It should be pointed out 
that the procedure is to be repeated over many scenes such that the final 
evaluation extends over all possible scenes that may be observed by the sensor. 

Two forms of spectral modeling were pursued. One is based on the well 
known Karhunen-Loeve expansion. The K-L expansion of a random process has the 
property that a waveform from the process can be represented by a linear com- 
bination of orthonormal basis functions with minimum mean square error. 
The second method uses information measures taken from formal information
theory. The results for the K-L approach will be discussed first. 

a. Karhunen-Loeve Approach. 

The minimum mean square error criterion will be the optimality criterion 
by which the quality of the K-L representation will be measured. The kernel of 
the integral equations which must be solved to obtain the orthonormal basis 
functions is the covariance of the random process for the scene S. The
covariance is unknown a priori and must be estimated from a finite set of
waveforms. It is at this point that the choice of waveforms to represent the
scene becomes important in the analysis.

It has been shown that increasing the number of measurements on a wave- 
form may actually decrease the performance of a pattern recognition system 
[2,3]. Therefore, it is expected that an increase in the number of
representation terms may increase the mean square error. In our procedure we
want to account for the finite sample size and its effect on the number of
basis functions that will be selected.

At this time an initial test data set has been assembled using 150 samples
from three classes taken over Williams County, North Dakota, in August, 1975.
An equal number of waveforms were taken from each of the classes wheat, fallow,
and pasture. A software system has been set up to estimate the covariance for
the waveforms, compute and order the orthonormal basis functions, and transform
the original waveforms to finite dimensional vectors.
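
The three steps named above (estimate the covariance, compute and order the basis functions, transform the waveforms) correspond, for sampled waveforms, to an eigen-decomposition of the sample covariance matrix. The sketch below is an illustration under stated assumptions, not the software system described in the report: the waveforms are synthetic, the function names are hypothetical, and the sample mean is removed before projection.

```python
import numpy as np


def kl_basis(waveforms):
    """Estimate the covariance and return (mean, eigenvalues, basis).

    waveforms has shape (n_samples, n_wavelengths). Eigenvalues are returned
    in descending order and the basis functions are the matching columns.
    """
    mean = waveforms.mean(axis=0)
    cov = np.cov(waveforms, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    return mean, eigvals[order], eigvecs[:, order]


def kl_transform(waveforms, mean, basis, n_terms):
    """Project each waveform onto the first n_terms basis functions."""
    return (waveforms - mean) @ basis[:, :n_terms]


def reconstruction_mse(waveforms, mean, basis, n_terms):
    """Mean square error of the n_terms representation over the sample set."""
    coeffs = kl_transform(waveforms, mean, basis, n_terms)
    approx = mean + coeffs @ basis[:, :n_terms].T
    return float(np.mean(np.sum((waveforms - approx) ** 2, axis=1)))


# Synthetic stand-in for 150 sampled spectra (hypothetical data).
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.4, 2.4, 40)
shapes = np.stack([np.sin(2 * wavelengths), np.cos(3 * wavelengths)])
waveforms = rng.normal(size=(150, 2)) @ shapes + 0.05 * rng.normal(size=(150, 40))

mean, eigvals, basis = kl_basis(waveforms)
for n_terms in (1, 2, 4):
    print(n_terms, reconstruction_mse(waveforms, mean, basis, n_terms))
```

On the sample set itself the error can only decrease as terms are added; the finite-sample concern raised in the text applies to waveforms that were not used to estimate the covariance.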

Once the transformation to finite dimensional vectors has been made the 
problem becomes a classical multivariate analysis problem. We assume that 
the process is Gaussian. The methods of estimation and classification for 
the multivariate Gaussian problem are well known. 

While the overall procedure has been outlined, there are several steps 
that need further investigation. First, a systematic procedure is needed to 
select the waveforms to represent the scene. A better understanding of the 
effect this selection process has on the analysis is needed. A second problem 
is the relationship between the mean square representation error and the number 
of sample functions available. And third, the relationship between the repre- 
sentation error and Bayes classification error has not been established. Since 
classification error is a common performance measure in the pattern recognition 
literature, an analysis of this relationship is important. 

During this contract period the software system was applied to the initial 
data set. The first four basis functions are shown in Figures B-2 through B-5. 
Improvements to the software system such as the graphing capability and better 
numerical techniques have been implemented. 

b. Information Theory Approach. 

The information theory approach is based on modeling the spectral re- 
sponse of a scene as a portion of a realization of a stochastic process in 
wavelength. This model is then used to evaluate the average (mutual) informa- 
tion for different bands of observed spectral scenes. Previous reports con- 
tain outlines of particular approaches taken to various aspects of the problem. 
In particular, the relation of average information and Wiener filtering is
outlined.

Figure B-4. Plot of Third Eigenvector of Spectral Ensemble Containing
Wheat, Fallow, and Pasture Classes.

Figure B-5. Plot of Fourth Eigenvector of Spectral Ensemble Containing
Wheat, Fallow, and Pasture Classes.

1. Modeling the Spectral Response of Different Scenes. 

A major problem to be solved is finding adequate models for the spectral 
response of different scenes. To demonstrate the technique used in this re- 
search, models for two different types of spectral scenes are identified. One 
spectral scene is wheat, and the other spectral scene is an average spectral 
response for several agricultural crops combined. This combined spectral
scene consists of: (1) oats, (2) barley, (3) grass, (4) alfalfa, and (5) fallow
fields. These spectral scenes were arbitrarily divided into the spectral
bands shown in Table B-7.


Table B-7. Wavelength Limits for the Spectral Bands.

            Wheat                  Combined Scene

Band 1      .4528 - .5380 μm       .4565 - .5402 μm
Band 2      .5380 - .6239 μm       .5402 - .6246 μm
Band 3      .6239 - .7097 μm       .6246 - .7097 μm
Band 4      .7097 - .8517 μm       .7097 - .8481 μm
Band 5      .8517 - .9910 μm       .8481 - .9850 μm
Band 6      .9910 - 1.130 μm       .9850 - 1.122 μm
Band 7      1.130 - 1.344 μm       1.122 - 1.307 μm
Band 8      1.446 - 1.821 μm       1.451 - 1.818 μm
Band 9      1.959 - 2.386 μm       1.967 - 2.386 μm


For each band, three different model types of several degrees of complexity
are hypothesized. An exception is Band 1 of the combined scene; this exception
will be discussed later. The first model is the autoregressive (AR) model of
order n defined by

\[ y(k) = a_1 y(k-1) + a_2 y(k-2) + \cdots + a_n y(k-n) + \omega(k) \tag{1} \]

where

$y(k)$ is the value of the spectral response at the discrete wavelength $k$;

$a_j$, $j = 1, \ldots, n$ are coefficients to be identified; and

$\omega(k)$ are independent, identically distributed samples of a zero mean
Gaussian random process of variance $\sigma$.

The second type of model is the autoregressive plus constant (AR+C) model
of order n defined by:

\[ y(k) = c + a_1 y(k-1) + \cdots + a_n y(k-n) + \omega(k) \tag{2} \]

where

$c$ is a constant to be identified; and the other parameters are the
same as in the first model.

The third model is the integrated autoregressive (LAR) model of order n. This 
model may be written as 

Vy(k) = a ,Vy(k-l) 4- ... + a 7y(k-n) + 0 )(k) (3) 

1 n 


where 


7y(k) = y(k) - y(k-l); 

Other parameters are as previously defined. 

Another model is used for Band 1 of the combined scene. This model is used
because the above three models could not be validated for Band 1 of the
combined scene. The model used is an extension of the integrated autoregressive
model, and is denoted as an integrated autoregressive of the second kind (IAR2)
of order n. It is defined by

$$\nabla_2 y(k) = a_1 \nabla_2 y(k-1) + \cdots + a_n \nabla_2 y(k-n) + \omega(k) \qquad (4)$$

where

$\nabla_2 y(k) = y(k) - y(k-2)$;

other parameters are as previously defined.

An excellent discussion of these models is given by Kashyap and Rao [4, Chap. 3].
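The two differencing operators used in equations (3) and (4) can be sketched as follows. This is a minimal illustration; the helper name `difference` and the sample values are hypothetical. An AR model of order n is then fitted to the differenced sequence rather than to y(k) itself.

```python
def difference(y, lag=1):
    """Return the sequence y(k) - y(k-lag) for k = lag, ..., len(y)-1."""
    return [y[k] - y[k - lag] for k in range(lag, len(y))]


samples = [1.0, 4.0, 9.0, 16.0, 25.0]  # hypothetical spectral samples
first_kind = difference(samples, lag=1)   # operator of equation (3)
second_kind = difference(samples, lag=2)  # operator of equation (4)
```

Note that the operator of equation (4) is a lag-2 difference, y(k) - y(k-2), and not the difference operator of equation (3) applied twice.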


The identification procedure for these models consists of estimating the
coefficients $a_j$, $j = 1, \ldots, n$ such that the model gives the best fit to the
actual measurement data of the spectral bands. In this study, maximum
likelihood identification techniques are utilized. Reference [4, Chap. 6] gives
details.

For the wheat scene, each of the first three models is identified for
orders n=1,...,10 for each of the spectral bands. The combined scene
produced data that are not as smooth as those for the wheat scene. Therefore,
the first three models are identified for orders n=1,...,15 for each of the
spectral bands. The fourth model is identified for orders n=1,...,15 for
Band 1. Hence approximately 700 possible models are identified. Of these
models, one was selected to represent each band.
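The figure of approximately 700 can be checked by a simple count. The sketch below assumes that each of the three model types was fitted at every order for all nine bands of each scene, which is how the text reads; the individual fits are not listed in this report.

```python
bands = 9

wheat_fits = 3 * 10 * bands     # three model types, orders 1..10
combined_fits = 3 * 15 * bands  # three model types, orders 1..15
iar2_fits = 15                  # IAR2, orders 1..15, Band 1 of the combined scene only

total_fits = wheat_fits + combined_fits + iar2_fits
print(wheat_fits, combined_fits, iar2_fits, total_fits)  # 270 405 15 690
```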




2. Selection of a Model for Each Spectral Band. 

Selection of a model for a particular band is based on a criterion that
includes goodness of fit and reflects the principle of parsimony. Parsimony
means that the model with the smallest number of parameters that adequately
represents the spectral process should be used. The principle of parsimony
is discussed in references [4, Chap. 8] and [5]. Each selected model is then
subjected to various validation tests on assumptions about the model and
on the similarity between statistical characteristics of the actual measurement
data and simulated data generated from the model. These tests are discussed
by Kashyap and Rao [4, Chap. 8]. It was during these validation tests that
it was discovered that the model given by equation (4) was necessary to
represent Band 1 of the combined scene. Only after passing all validation tests
is a model accepted as representative of the spectral response process of a
particular spectral band.

Based on the above techniques, the models identified as representing
the spectral response processes of the respective spectral bands are shown
in Table B-8.


Table B-8. Models for the Spectral Bands. 


Band      Wheat models      Combined scene models

1         AR(6)             IAR2(11)
2         AR(2)             AR(2)
3         IAR(10)           IAR(11)
4         AR(1)+C           AR(1)+C
5         AR(1)             AR(3)
6         AR(2)+C           AR(1)
7         IAR(9)            AR(9)+C
8         IAR(9)            IAR(8)
9         IAR(6)            AR(1)


In the above table

AR(n)     = autoregressive model of order n.
AR(n)+C   = autoregressive plus constant model of order n.
IAR(n)    = integrated autoregressive model of order n.
IAR2(n)   = integrated autoregressive model of the second kind of order n.


These models are interesting in their own right. They give dynamic models for
the spectral response in each band for the two different scenes. As mentioned
in previous reports, these models are used to study the informational
characteristics of the spectral bands. The models for the spectral bands are
formulated in the above manner to take advantage of their useful computational
properties. Let the observed spectral scene z(k) be written as

$$z(k) = y(k) + v(k) \qquad (5)$$

where k is a discrete wavelength in the spectral interval (or spectral band)
$[\lambda_1, \lambda_2]$ of interest. The term v(k) represents the observation noise that is
present at the multispectral scanner. Kalman filtering techniques may then
be used with the models identified for the spectral bands to compute the
average (mutual) information in the received spectral process z(k) about
the spectral response y(k). These computations will aid in determining which
bands in a spectral scene contribute the most average information.

3. Calculation of Average Information 

The required Kalman filter expressions have been implemented on the LARS
computer system for this study. Using this implementation, the average
information for each of the different spectral bands for both types of scenes
is computed. For demonstration purposes, the same value for the variance of
v(k) is used in all spectral bands for both scenes. This may not be entirely
realistic since different noise disturbances may be expected in different
spectral bands. However, it is thought that for first comparisons a constant
variance for v(k) is useful. The results of the above computations are shown
in Table B-9.


Table B-9. Average Information for the Spectral Bands. 


          Average Information      Average Information
Band      Wheat Scene              Combined Scene

1         17.6 nats                30.3 nats
2          4.9 nats                 9.0 nats
3         11.7 nats                17.0 nats
4         17.7 nats                26.2 nats
5         36.6 nats                32.1 nats
6         24.0 nats                28.5 nats
7         43.3 nats                72.7 nats
8         35.2 nats                54.6 nats
9         34.1 nats                63.8 nats


(Note: "nats" is the unit of measure resulting from the use of natural
logarithms. The unit of measure may be changed to "bits" by converting to
logarithms of the base 2.)
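The change of units mentioned in the note amounts to dividing by the natural logarithm of 2. A minimal sketch follows; the function name `nats_to_bits` is introduced here for illustration only.

```python
import math


def nats_to_bits(nats):
    """Convert an information value from nats to bits (1 bit = ln 2 nats)."""
    return nats / math.log(2.0)


# Band 1 of the wheat scene in Table B-9: 17.6 nats is about 25.4 bits.
band1_wheat_bits = nats_to_bits(17.6)
```

Because the conversion is a constant scale factor, it does not change the relative ordering of the bands.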
In the above table not all the spectral bands have the same spectral
bandwidth. Thus, it is of interest to compare these results with those given
in Table B-10 for the case of equal spectral bandwidths.


Table B-10. Average Information for Equal Spectral Bandwidths.


          Average Information      Average Information
Band      Wheat Scene              Combined Scene

1         17.6 nats                30.3 nats
2          4.9 nats                 9.0 nats
3         11.7 nats                17.0 nats
4         16.1 nats                24.3 nats
5         36.6 nats                32.1 nats
6         24.0 nats                28.5 nats
7         32.1 nats                51.9 nats
8         28.5 nats                45.0 nats
9         24.2 nats                45.9 nats


It is thought that the average information in the combined scene is higher
in absolute terms than in the wheat scene because the variation in the spectral
response of the combined scene is higher. That is, the spectral response in
the wheat scene is "smoother" than in the combined scene. The relative value of
the average information between different spectral bands within each scene
type is the more important parameter. These relative values determine which
subset of spectral bands provides the maximum average information for observing
a particular scene type. The variations of average information among spectral
bands are important topics for further study.
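The relative values within each scene type can be examined by ranking the bands of Table B-9. The sketch below only sorts the tabulated values; the helper name `rank_bands` is hypothetical, and no claim is made that the study ranked bands in exactly this way.

```python
# Average information in nats, copied from Table B-9.
wheat = {1: 17.6, 2: 4.9, 3: 11.7, 4: 17.7, 5: 36.6,
         6: 24.0, 7: 43.3, 8: 35.2, 9: 34.1}
combined = {1: 30.3, 2: 9.0, 3: 17.0, 4: 26.2, 5: 32.1,
            6: 28.5, 7: 72.7, 8: 54.6, 9: 63.8}


def rank_bands(info):
    """Return band numbers ordered from highest to lowest average information."""
    return sorted(info, key=info.get, reverse=True)


wheat_ranking = rank_bands(wheat)        # [7, 5, 8, 9, 6, 4, 1, 3, 2]
combined_ranking = rank_bands(combined)  # [7, 9, 8, 5, 1, 6, 4, 3, 2]
```

Band 7 ranks first and Band 2 last for both scene types, while the intermediate ordering differs between the two scenes.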

4. Further Investigations 

These information theoretic ideas will be pursued further. A particular
avenue of approach is to calculate average information in the different spectral
bands for different levels of the noise v(k). Models for spectral scenes other
than wheat and the combined scene merit study. The relation between average
information and the classification of observed spectral scenes needs further
research. These studies should eventually lead to better analytical
understanding of some multispectral scanner parameters.

In order to explain more fully the information theory approach being
pursued for scanner modeling, a brief tutorial discussion has been prepared.
This material is presented as an appendix to Section B2, Scanner System
Parameter Selection.

IV. Tasks 3 and 6. Spatial Modeling and Noise Modeling. 

The effort in the spatial modeling task has been devoted to specific software
development. In particular, a convolution algorithm has been implemented which
can effectively simulate any scanner point spread function and the resultant
output. Also, software is now available to simulate or add random noise,
according to any specified signal-to-noise ratio, to the multispectral data.
These two programs should complete the software package required in the
Scanner System Parameter Selection task.
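The two operations described above can be sketched in one dimension along a single scan line. This is not the LARS software, which is not reproduced in this report; the function names, the zero padding at the ends of the line, and the definition of signal-to-noise ratio as mean signal power divided by noise variance are all assumptions made for illustration.

```python
import math
import random


def convolve_psf(signal, psf):
    """Convolve a scan line with a point spread function centered on its middle tap.

    The output has the same length as the input; samples beyond the ends
    of the line are treated as zero (an assumption).
    """
    half = len(psf) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(psf):
            idx = i + half - j  # convolution index: signal[i - (j - half)]
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def add_noise(signal, snr, seed=0):
    """Add zero-mean Gaussian noise so that mean signal power / noise variance = snr."""
    rng = random.Random(seed)
    power = sum(v * v for v in signal) / len(signal)
    sigma = math.sqrt(power / snr)
    return [v + rng.gauss(0.0, sigma) for v in signal]


# Hypothetical example: blur an impulse with a 3-tap PSF, then add noise at SNR = 10.
blurred = convolve_psf([0, 0, 1, 0, 0], [0.25, 0.5, 0.25])
noisy = add_noise(blurred, 10.0)
```

A two-dimensional point spread function would be handled in the same way with a second pair of nested loops over the other image axis.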
Appendix B-1. Information Theory Techniques for Analyzing Parameters of
Multispectral Scanner Systems.

I. Introduction 

In many remote sensing problems a multispectral scanning device is the
major data collection system. The collected data are then processed in a
manner that reflects the purpose for which they are to be used. A typical
example of the use of a multispectral scanner is the Landsat earth resources
satellite. The data gathered by the satellite are used to provide information
about agricultural scenes, natural resources, land utilization, and others.
A particular use for the data is the classification of crops in agricultural
areas.

There are many parameters to be considered when a multispectral scanner
system is selected to study a particular problem. The main area of interest
for this study lies in the use of the multispectral scanner for agricultural
purposes. Some of the parameters of interest are placement of spectral
bands within the spectrum, spectral bandwidth, and signal-to-noise properties.
Other parameters are spatial resolution, spatial sampling methods, and
utilization of ancillary data. Another parameter of interest is the specific
types of scenes to be observed.

At the present time, the studies concerning selection of multispectral 
scanner parameters have been mostly ad hoc and empirical. Landgrebe, Biehl 
and Simmons [1] have completed an extensive empirical study of multispectral 
scanner system parameters. In this study, several parameters were chosen and 
the resulting hypothesized multispectral scanner was simulated to judge its 
performance. The performance criteria used were classification accuracy and 
the root-mean-square (r.m.s.) error in proportion estimation. 

It would seem advantageous to develop analytical techniques to study and 
select multispectral scanner parameters. Very little work has been done in 
this respect until the present efforts at LARS. Currently there are two 
subtasks in the Scanner System Parameter Selection task concerning scanner 
modelling and another subtask on approximate evaluation of classification error 
for the multiclass classification problem. One of the scanner modelling sub- 
tasks approaches the problem from an information theoretic viewpoint. Dis- 
cussion of this informational viewpoint is the subject of the remainder of 
this paper. 

II. Informational Viewpoint 

Consider a scene as a source producing information in the form of its
spectral reflectance response. If it is assumed that this response is a
random process in wavelength, then information theoretic techniques may yield
some useful results for determination of desirable multispectral scanner
parameters. The multispectral scanner may be considered as a receiver
observing a noise corrupted version of the spectral response process. That
is, assume a model for the observed spectral process of the form:

$$z(\lambda) = y(\lambda) + n(\lambda)$$

where

$z(\lambda)$ is the observed spectral process;

$y(\lambda)$ is the spectral reflectance process of the scene;

$n(\lambda)$ is the noise disturbance process;

and

$\lambda$ lies in the spectral observation interval.

Now, the noise process will cause some loss of information about the
spectral scene. It is desirable to minimize this information loss. Stated
in another manner, it is desired to maximize the information in $z(\lambda)$ about
$y(\lambda)$. This may also be interpreted in an information theoretic sense.
Suppose the spectral response $y(\lambda)$ is one of several possible classes
$C_1, \ldots, C_m$. Then the above result may be stated in terms of average mutual
information in $z(\lambda)$ about $y(\lambda)$ for class $C_j$:

$$I(y; z \mid C_j) = \int_Y \int_Z p(y, z \mid C_j) \log \left[ \frac{p(y \mid z, C_j)}{p(y \mid C_j)} \right] dy \, dz = E\left\{ \log \left[ \frac{p(y \mid z, C_j)}{p(y \mid C_j)} \right] \right\} = H(y \mid C_j) - H(y \mid z, C_j),$$

where $H(y \mid C_j)$ and $H(y \mid z, C_j)$ are entropies. Thus it is desired to maximize the
above expression. If each class $C_j$ occurs with probability $P(C_j)$, $j = 1, \ldots, m$,
and

$$\sum_{j=1}^{m} P(C_j) = 1,$$

then

$$I(y; z \mid \mathcal{C}) = \sum_{j=1}^{m} P(C_j) \, I(y; z \mid C_j).$$

This may be interpreted as the average mutual information in $z(\lambda)$ about
$y(\lambda)$ given the set of classes $\mathcal{C} = \{C_1, \ldots, C_m\}$. Hence $I(y; z \mid \mathcal{C})$ may be a
useful concept to study for the observation of several classes of spectral
scenes.
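The class-weighted average can be illustrated numerically. In the sketch below the per-class values are the Band 5 entries of Table B-9, with the wheat scene and the combined scene treated as two classes; the equal class probabilities and the function name `class_averaged_information` are assumptions made only for this example.

```python
def class_averaged_information(per_class_info, priors):
    """Return sum_j P(C_j) * I(y; z | C_j) for matching lists of values."""
    if len(per_class_info) != len(priors):
        raise ValueError("one prior is required per class")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("class probabilities must sum to one")
    return sum(p * info for p, info in zip(priors, per_class_info))


# Band 5, Table B-9: 36.6 nats (wheat) and 32.1 nats (combined scene).
band5_average = class_averaged_information([36.6, 32.1], [0.5, 0.5])  # about 34.35 nats
```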

Practical multispectral scanners are usually designed in terms of spectral 
bands. Thus one avenue of study would be to use the above criterion to pro- 
vide an analytical method of choosing a set of spectral bands for a multi- 
spectral scanner. This approach will be carried out in the following manner. 
Divide the spectral response into several bands as shown below. 
For many applications (i.e., classification) it is desired to use a
subset of the total possible spectral bands. It is desirable to choose a
subset of bands and still maximize the average mutual information about the
scene given by the subset.

All of the above concepts of using mutual information presuppose a
useful method for computation of the mutual information. It is thought that
such a technique has been developed. In order to use analytic techniques for
computation it is first necessary to develop adequate models for the spectral
processes of interest. The modeling techniques used are based on concepts
of model identification used in the area of time series analysis. A specific
identification technique that seems most useful is the sequential Bayesian
(and conditional maximum likelihood) technique. An excellent discussion of
the technical details of the above method is given by Kashyap and Rao [2].
The basic idea of the method is to identify parameters for hypothesized models.
The best model is then chosen according to an appropriate selection criterion.
This model is then subjected to various validation tests. If the model passes
the validation tests it may then be considered a valid model for the process.
Failure of the validation tests implies that perhaps one should search for
another model. Models currently being studied for spectral processes are
forms of autoregressive models and integrated autoregressive models. These
models are fairly simple and seem to give reasonably good characterization of
the spectral processes. Furthermore, these models are of a form that is
readily amenable to computation of mutual information in a received process
$z(\lambda)$ about a spectral process $y(\lambda)$. An example of the form of a spectral model
is as follows:


An autoregressive (AR) model for the spectral response is
hypothesized of the form:

$$y(k) = a_1 y(k-1) + a_2 y(k-2) + w(k)$$

where

$a_1$, $a_2$ are coefficients to be identified;

$k$ is an integer that corresponds to a discrete wavelength;

$w(k)$ are independent, identically distributed samples of
a Gaussian random process.
The procedure is then to identify the coefficients $a_1$ and $a_2$ from
empirical data gathered about the scene. An example of such data would be
detailed spectral response data for wheat. The model is then subjected to
validation tests which concern assumptions made on w(k) and similarity of
statistical characteristics of the model and the empirical data.
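As a rough illustration of the identification step, the following sketch estimates $a_1$ and $a_2$ by ordinary least squares, which coincides with conditional maximum likelihood estimation of the coefficients when w(k) is Gaussian. It is not the sequential Bayesian procedure of reference [2]; the function name `fit_ar2`, the simulated data, and the "true" coefficient values are hypothetical.

```python
import random


def fit_ar2(y):
    """Least-squares estimate of (a1, a2) in y(k) = a1*y(k-1) + a2*y(k-2) + w(k)."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for k in range(2, len(y)):
        u, v, target = y[k - 1], y[k - 2], y[k]
        s11 += u * u
        s12 += u * v
        s22 += v * v
        b1 += u * target
        b2 += v * target
    # Solve the 2x2 normal equations [[s11, s12], [s12, s22]] [a1, a2]^T = [b1, b2]^T.
    det = s11 * s22 - s12 * s12
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2


# Simulated stand-in for empirical data (hypothetical coefficients).
rng = random.Random(1)
true_a1, true_a2 = 1.2, -0.5
y = [0.0, 0.0]
for _ in range(5000):
    y.append(true_a1 * y[-1] + true_a2 * y[-2] + rng.gauss(0.0, 1.0))

est_a1, est_a2 = fit_ar2(y)  # expected to lie close to 1.2 and -0.5
```

The residuals y(k) - est_a1*y(k-1) - est_a2*y(k-2) are the quantities on which the validation tests for w(k) would be carried out.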

A model of the form given above may be placed in a form that is more
amenable to computation techniques. This form, known as a linear state
variable form, is the modern conventional form for dynamic models. An outline of
the procedure for obtaining such a form is given in reference [3]. Application
of this technique to the model of the previous example is shown below.


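Define:

$$x_1(k) = y(k-1) = x_2(k-1)$$

$$x_2(k) = y(k) = a_1 y(k-1) + a_2 y(k-2) + w(k)$$

Or in matrix form:

$$x(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ a_2 & a_1 \end{bmatrix} \begin{bmatrix} x_1(k-1) \\ x_2(k-1) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} w(k)$$

Thus

$$y(k) = \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix}$$

The observed spectral process may then be written as:

$$z(k) = y(k) + n(k)$$

where n(k) is a sample of the noise process.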
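The equivalence between the state variable form and the original AR(2) recursion can be checked numerically. In this sketch the names `F`, `G` and `H` for the transition matrix, noise input vector and output vector are introduced here for convenience and do not appear in the report; the coefficients and noise samples are hypothetical.

```python
def ar2_state_space(a1, a2):
    """Return (F, G, H) with x(k) = F x(k-1) + G w(k) and y(k) = H x(k)."""
    F = [[0.0, 1.0], [a2, a1]]  # state transition matrix
    G = [0.0, 1.0]              # noise enters the second state only
    H = [0.0, 1.0]              # y(k) is the second state
    return F, G, H


def step(F, G, x, w):
    """Advance the two-element state vector by one wavelength sample."""
    return [F[0][0] * x[0] + F[0][1] * x[1] + G[0] * w,
            F[1][0] * x[0] + F[1][1] * x[1] + G[1] * w]


a1, a2 = 0.5, 0.25
noise = [1.0, 0.0, 2.0, -1.0]

# State variable recursion, starting from a zero state.
F, G, H = ar2_state_space(a1, a2)
x = [0.0, 0.0]
state_y = []
for w in noise:
    x = step(F, G, x, w)
    state_y.append(H[0] * x[0] + H[1] * x[1])

# Direct AR(2) recursion with zero initial conditions.
direct = [0.0, 0.0]
for w in noise:
    direct.append(a1 * direct[-1] + a2 * direct[-2] + w)
direct_y = direct[2:]
```

Both recursions produce the same sequence, which confirms that the second state carries y(k) and the first state carries y(k-1).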


As stated previously a model of this form will be useful in the calculation
of mutual information. Since the observation (by the multispectral
scanner) is corrupted by noise there is uncertainty about the state, $x(k)$,
of the process. It is clear that knowledge of the state of the process is
equivalent to knowledge of $y(k)$. Thus the quantity of interest is the mutual
information between the state of the process and the estimate, $\hat{x}(k)$, of the
state of the process. Denote this mutual information as $I(x; \hat{x} \mid C_j)$. It may
be shown that the procedure that maximizes this information also is the
procedure which minimizes the quantity $E[\tilde{x}^2]$, where $\tilde{x} = (x - \hat{x})$ [4]. It is well
known that the Kalman filter procedure minimizes the covariance of the
estimation error $E[\tilde{x}^2]$ for a model of the linear state variable form [5].
Consider the application of this result to the previous model.

If $y(k) = x_2(k)$

and the estimate of $y(k)$ based on $z(k)$ is given by
$\hat{y}(k) = \hat{x}_2(k)$,

then the maximum mutual information in $z(k)$ about $y(k)$ is
given by:

$$I(y; z \mid C_j) = I(x_2; \hat{x}_2 \mid C_j)$$

Thus we have a technique for determining the maximum mutual information in 
z(k) about y(k) (for each k) within the limitations of the model. Also it 
should be noted that the Kalman filter technique gives the optimal (in a mean 
square sense) estimate of the actual spectral reflectance process. The 
limitation of this technique seems to be primarily due to the limitations of 
the accuracy of the model of the spectral process. 

It has been shown by R. Y. Huang [6] that for a continuous Gaussian process
the information in the observed process about the spectral response process
between the wavelengths $\lambda_1$ and $\lambda_2$ is given by:

$$I(\lambda_1, \lambda_2) = \frac{1}{2} \int_{\lambda_1}^{\lambda_2} h(\lambda, \lambda) \, d\lambda$$

where $h(\lambda, \lambda)$ is the optimal Wiener filter for estimating $y(\lambda)$ from $z(\lambda)$. It
can be shown that the optimal Kalman filter and the optimal Wiener filter give
identical results [5]. The Kalman filter gain is written as a column vector
of the same order as the state vector. However, due to the particular form
of our state model, the only term of interest is the last element of the column
vector. Also, since we are dealing with a discrete model, the integral in the
preceding equation is replaced by a summation. Thus the above equation for
mutual information may be written as

$$I(y; z \mid C_j) = \frac{1}{2} \sum_{k \in [\lambda_1, \lambda_2]} K_n(k)$$

where $K_n(k)$ is the nth (n is the order of the state vector) element of the
column vector Kalman gain expression.
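A minimal sketch of this computation for the AR(2) example follows. It propagates the Kalman filter error covariance for the two-state model, records the last element of the gain vector at each wavelength sample, and forms one half of the sum of those elements, following the continuous-wavelength expression with the integral replaced by a summation. It is not the LARS implementation; the function name, the initial covariance, the noise variances, and the number of samples in the band are all assumptions.

```python
def kalman_gains_ar2(a1, a2, q, r, steps, p0=1.0):
    """Return the last Kalman gain element for each of `steps` samples.

    q  : variance of the model noise w(k)
    r  : variance of the observation noise n(k)
    p0 : assumed initial variance of each state (diagonal initial covariance)
    """
    p00, p01, p11 = p0, 0.0, p0  # symmetric 2x2 error covariance
    gains = []
    for _ in range(steps):
        # Time update: M = F P F^T + G q G^T, with F = [[0, 1], [a2, a1]].
        m00 = p11
        m01 = a2 * p01 + a1 * p11
        m11 = a2 * (a2 * p00 + a1 * p01) + a1 * (a2 * p01 + a1 * p11) + q
        # Measurement update with H = [0, 1]: K = M H^T / (H M H^T + r).
        s = m11 + r
        k1, k2 = m01 / s, m11 / s
        # P = (I - K H) M
        p00 = m00 - k1 * m01
        p01 = m01 - k1 * m11
        p11 = m11 - k2 * m11
        gains.append(k2)
    return gains


# Hypothetical band of 50 wavelength samples with unit noise variances.
gains = kalman_gains_ar2(1.2, -0.5, q=1.0, r=1.0, steps=50)
band_information = 0.5 * sum(gains)  # in nats
```

The gains depend only on the model coefficients and the two noise variances, not on the observed data, so the information for a band can be computed once the model for that band has been identified.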

Thus a technique has been developed to compute mutual information for
any portion (or band) of the spectral response process. The implication is
clear. To select a subset of bands, it is necessary to compute the mutual
information for each possible band and then choose the subset that gives the
largest overall mutual information. For choosing a set of bands to observe a
number of classes the problem is more complicated. Recall that in this case
we are dealing with the mutual information as given by:

$$I(y; z \mid \mathcal{C}) = \sum_{j=1}^{m} P(C_j) \, I(y; z \mid C_j)$$

Thus the problem of finding an overall set of bands may require the use of a
search routine on a computer. At any rate, the technique is clear.
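One possible form of such a search routine is an exhaustive search over all subsets of a given size. The sketch below assumes that the information of a subset is the sum of the per-band values, which is a simplification not established in this report, and it uses equal class probabilities; both are assumptions. The per-band values are taken from Table B-9, with the wheat scene and the combined scene treated as two classes.

```python
from itertools import combinations

# Average information in nats per band, from Table B-9.
wheat = {1: 17.6, 2: 4.9, 3: 11.7, 4: 17.7, 5: 36.6,
         6: 24.0, 7: 43.3, 8: 35.2, 9: 34.1}
combined = {1: 30.3, 2: 9.0, 3: 17.0, 4: 26.2, 5: 32.1,
            6: 28.5, 7: 72.7, 8: 54.6, 9: 63.8}


def best_subset(info_by_class, priors, size):
    """Return the band subset with the largest prior-weighted summed information."""
    bands = sorted(info_by_class[0])

    def score(subset):
        # Assumes information adds across bands (a simplifying assumption).
        return sum(p * sum(info[b] for b in subset)
                   for p, info in zip(priors, info_by_class))

    return max(combinations(bands, size), key=score)


chosen = best_subset([wheat, combined], [0.5, 0.5], size=3)  # (7, 8, 9)
```

With nine bands there are only 84 subsets of size three, so exhaustive search is practical; for larger band sets a more economical search strategy would be needed.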

The relation of the above technique to performance of a multispectral
scanner in terms of classification accuracy is complicated. It appears, at
the present time, that performance may have to be specified analytically only
in terms of bounds on classification accuracy. This is a problem that remains
to be pursued.

III. Summary 

An overview of an analytic technique for studying some multispectral
scanner parameters has been given. In particular, an informational technique
has been considered for selecting spectral bands for multispectral scanners.
The relationship of the analytic parameters to classification accuracy
remains to be considered in more detail.